Many opportunities, and a few challenges
March 13, 2025
In divergent thinking tests, fluency scores are the count of unique responses provided by a respondent to an item.
Traditionally, analyses of fluency scores rely on classical test theory (e.g., sum scores, \(\alpha\), traditional factor analysis).
All classical test theory models assume continuous, (conditionally) normally distributed scores; fluency counts (non-negative integers, typically right-skewed) violate these assumptions.
2PPCM (Myszkowski & Storme, 2021): the fluency score is drawn from a Poisson distribution…
\[X_{ij} \sim \text{Poisson}(\lambda_{ij})\]
…with the rate/expectation (and variance) given by:
\[\lambda_{ij} = e^{a_j\theta_i + b_j}\]
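The generative model above can be sketched in a few lines of base R; all parameter values below are invented for illustration.

```r
# Simulate 2PPCM fluency scores: X_ij ~ Poisson(exp(a_j * theta_i + b_j))
set.seed(1)
n_persons <- 200
a <- c(0.8, 1.0, 1.2)     # item slopes (hypothetical values)
b <- c(2.0, 2.2, 1.8)     # item easiness (hypothetical values)
theta <- rnorm(n_persons) # latent fluency, standard normal

# Rate matrix: rows = persons, columns = items
lambda <- exp(outer(theta, a) + matrix(b, n_persons, length(b), byrow = TRUE))
scores <- matrix(rpois(length(lambda), lambda), n_persons, length(b))
colMeans(scores) # average observed fluency per item
```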
Maximum likelihood approaches:
General-purpose latent variable software (e.g., Mplus, Stata)
Dedicated count IRT R packages (e.g., countirt)
Generalized linear mixed model packages (e.g., lme4)
Can a general-purpose Bayesian estimation framework do better?
Is it feasible to estimate log-linear count IRT models in a Bayesian framework with packages not dedicated to count IRT?
Can we obtain results similar to maximum likelihood estimates?
Are there benefits to this approach? Are they easily attainable and useful?
What are the (current) limitations?
(Workflow diagram: Inputs → Process → Output)
Stan and brms
Stan (Carpenter et al., 2017): a probabilistic programming language suited for Bayesian estimation using Hamiltonian Monte Carlo (HMC).
brms (Bürkner, 2017): an R package to estimate various models in Stan using regression-like syntax (e.g., y ~ x1 + x2).
Stan syntax
It's not too bad! See our paper (Myszkowski & Storme, 2025)…
…but here's a quick look!
Publicly available dataset from a special issue (Forthmann et al., 2019)
202 respondents (variable Person)
3 alternate uses tasks (rope, paperclip, garbage bag) (variable Item)
formula_2PPCM <- bf(
  Score ~ 0 + slope * theta + easiness, # linear part of the item response model
  theta ~ 0 + (1 | Person), # theta is a random effect of the person
  slope ~ 0 + Item, # slope is a fixed effect of the item
  easiness ~ 0 + Item, # easiness is a fixed effect of the item
  nl = TRUE, # activate <Weird Model Mode>
  family = poisson(link = "log") # log-linear Poisson model
)
Full posterior distributions of all parameters.
What is the probability that finding alternate uses of a rope is easier than of a garbage bag?
What is the probability that person 1’s fluency is more than 1 standard deviation higher than person 2’s fluency?
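Such questions reduce to proportions of posterior draws. A minimal sketch with fabricated draws (in a real analysis, these would be extracted from the fitted brms model, e.g., via as_draws_df()); all numbers below are invented:

```r
# Posterior probabilities are just proportions of posterior draws
set.seed(2)
n_draws <- 4000
easiness_rope <- rnorm(n_draws, 2.3, 0.05) # hypothetical posterior draws
easiness_bag  <- rnorm(n_draws, 2.1, 0.05)
theta_p1 <- rnorm(n_draws, 0.9, 0.30)
theta_p2 <- rnorm(n_draws, -0.4, 0.30)

# P(finding uses for a rope is easier than for a garbage bag)
mean(easiness_rope > easiness_bag)
# P(person 1's fluency is more than 1 SD above person 2's)
mean(theta_p1 - theta_p2 > 1)
```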
Treat item characteristics (e.g., slopes) as random deviations from a shared distribution.
Item covariates (e.g., object type):
formula_2PPCM_expl <- bf(
  Score ~ 0 + slope * theta + easiness,
  theta ~ 0 + (1 | Person),
  slope ~ 0 + Item,
  easiness ~ 0 + object_type, # item covariate replaces per-item easiness
  nl = TRUE,
  family = poisson(link = "log")
)
Person covariates (i.e., latent regression/latent mean differences)
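As a sketch of the person-covariate (latent regression) case, assuming brms is loaded and `group` is a hypothetical person-level variable in the data:

```r
# Hypothetical sketch: `group` is an assumed person-level covariate.
# Adding it to the theta sub-model yields a latent regression
# (latent mean differences across groups).
formula_2PPCM_latreg <- bf(
  Score ~ 0 + slope * theta + easiness,
  theta ~ 0 + group + (1 | Person), # person covariate on the latent trait
  slope ~ 0 + Item,
  easiness ~ 0 + Item,
  nl = TRUE,
  family = poisson(link = "log")
)
```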
By using informative priors, we can avoid unrealistic parameter values and unstable models.
Probably useful for extensions (e.g., avoiding differential item functioning false positives).
Particularly useful in small datasets that we don’t want to trust “maximally”.
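A sketch of what informative priors might look like in brms; the distributions and scales below are illustrative assumptions, not recommendations:

```r
# Illustrative weakly informative priors for the 2PPCM's nonlinear parameters
priors_2PPCM <- c(
  prior(normal(0, 1), nlpar = "slope"),    # keeps slopes in a realistic range
  prior(normal(0, 2), nlpar = "easiness")  # keeps easiness in a realistic range
)
# then passed to brm(formula_2PPCM, data = ..., prior = priors_2PPCM)
```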
We can combine multiple models and/or multiple sets of priors using Bayesian stacking.
For example, we can obtain \(\theta\) posterior distributions from different models, which we average, weighted by their fit.
Avoids reliance on a single model/set of priors, leading to more robust predictions.
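A minimal base-R sketch of the averaging step, with made-up draws and made-up stacking weights (in practice, weights would come from something like loo::loo_model_weights()):

```r
# Combine theta posterior draws from two models with stacking weights
set.seed(3)
n_draws <- 4000
theta_m1 <- rnorm(n_draws, 0.5, 0.30) # hypothetical draws from model 1
theta_m2 <- rnorm(n_draws, 0.3, 0.25) # hypothetical draws from model 2
w <- c(0.7, 0.3)                      # hypothetical stacking weights (sum to 1)

# Stacking mixes the posteriors: resample each draw from a model chosen
# with probability equal to its weight
pick <- sample(1:2, n_draws, replace = TRUE, prob = w)
theta_stacked <- ifelse(pick == 1, theta_m1, theta_m2)
mean(theta_stacked) # stacked posterior mean
```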
In this dataset, default priors were sufficient for the RPCM, but not for the 2PPCM.
Find this presentation at https://osf.io/9f4eu/.